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Abstract. We review selected methods of the Cosmic Microwave Background data 
analysis appropriate for the analysis of the largest currently available data sets. 
We focus on techniques of the time-ordered data manipulation and map making 
algorithms based on the maximum-likelihood approach. The presented methods 
have been applied to the maxima-i data analysis (Hanany et al. 2000) and the 
description of the algorithms is illustrated with the examples drawn from that 
experience. The more extensive presentation of the here-mentioned issues will be 
given in the forthcoming paper (Stompor et al. 2001). 



1 Introduction 



The growing size and complexity of the available CMB data sets poses a diffi- 
cult challenge for the data analysis (e.g. Borrill (1999), Borrill et al. (2000)). 
The largest already existing data sets (e.g. boomerang-ldb, maxima-i) 
consist of up to many million of the time measurements sampling areas of 
the sky of up to many thousands of the beam-size pixels. The analysis of 
such a big data set is usually divided into three distinct stages focusing in 
turn on the time-ordered data, maps and power spectra. On each stage the 
size of the data item to be manipulated is significantly reduced. However, a 
successful implementation of such a program requires that the compression 
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performed on each stage should should not compromise the cosmological in- 
formation contained in the data. The algorithms presented in the following 
are designed to be optimal - or nearly optimal - in a maximum-likelihood 
sense under the assumptions of the Gaussianity and stationarity of the in- 
strumental noise. 

We start from the description of the time-ordered data manipulation tech- 
niques which prove to be necessary to ensure the efficiency and applicability of 
the map making algorithms. Then we briefly review a variety of map making 
algorithms highlighting involved assumptions and assessing their efficiency. 

2 Noise estimation and gaps filling 

In the usual map-making methodology the noise power spectra are taken 
to be given with an arbitrary high precision. The circumstances in which 
actual CMB data are collected are specific enough to require the noise es- 
timation to be evaluated directly from the time-ordered data. The robust 
noise-estimation procedure therefore needs to be capable of efficient dealing 
with the undesirable effects present in the real data. 

The common feature of the realistic time streams are short breaks (gaps) 
in their continuity caused e.g. by the cosmic rays or other transients in the 
experimental apparatus. Their presence not only prevents a straightforward 
application of the Fast Fourier techniques (FFT) to the noise estimation but 
also introduces a whole suite of subtle problems related to their treatment on 
the level of the time-ordered data manipulation. They also appear to be one 
of major obstacles in an implementation of the efficient and fast map mak- 
ing algorithms. Therefore restoration of the continuity of the time-ordered 
data - in a statistically correct way - stands out as one of the important 
goals of the data analysis on the time domain stage. A required algorithm 
needs simultaneously to estimate the noise power spectrum and fill the gaps 
with pure Gaussian noise realization without compromising its correlations 
throughout the length of the time stream. As described in detail in Stompor 
et al. (2001) that can be achieved iteratively when on each step of the itera- 
tions the gaps are filled with the constrained realization (Hoffman & Ribak 
1991) of the Gaussian noise, a power spectrum of which has been determined 
on the preceding step of the iterations using standard FFT methods. 

The noise estimation procedure attempts to estimate a noise ensemble 
average power spectrum using just one realization of the time stream. That 
is clearly not sufficient. As a result of the above sketched algorithm one 
usually ends up with the reliable estimation of the ensemble average noise 
power spectrum in time domain at sufficiently high frequencies (for MAXIMA- 
I for / £ 0.1Hz), but the low frequency end is a subject to a non-negligible 
sampling error. To minimize the effect of the low frequencies we marginalize 
over the low frequency part of the spectrum. No significant dependence on 
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the assumed low frequency cut-off (in the range from 0.01 Hz up to 0.4 Hz) 
has been found in the MAXIMA-I case. 

Adding randomly drawn signal to the data introduces an extra freedom 
and may undermine the uniqueness of the results. Though depending on the 
particular noise realization the results may somewhat differ, all of them are 
statistically equivalent by construction, and no bias is introduced. Moreover, 
that extra randomness can be minimized by introducing a fictitious gap pixel 
to the recovered map. As a result we find that the final maps and their power 
spectra for MAXIMA-I are robust and not affected by that procedure. 

So far, for simplicity we have been assuming that the time-ordered data 
are noise dominated, however, an extension of the described algorithm for 
the more general case is straightforward (Ferreira & Jaffe 2000). 

3 Map making 

The general algebra for the maximum-likelihood map making under the as- 
sumption of the Gaussian correlated noise can be found in e.g. Tegmark 
(1997). In short, if the time stream data d can be modeled as 

d = Am + n, (1) 

where d is the vector of the measurements for the given chunk of the time 
stream, A a pointing matrix assigning each of the time samples to an appro- 
priate pixel in the sky and n - a time stream noise - is a vector of the random 
Gaussian variables with correlations as given by N t - a time domain corre- 
lation matrix, then a (maximum-likelihood) map m and its noise-correlation 
matrix in pixel domain (N p ) are given by, 

m = (ADMD T A T y 1 ADMd (2) 
N p = (ADMD T A T y 1 (ADMN t MD T A T ) (ADMD T A T y 1 (3) 

here M is a positive definite symmetric matrix. D is a time domain represen- 
tation of the filters applied to the time stream on the previous stages (e.g. a 
prewhitening filter) assumed to be orthogonal DD T = 1. 

If M = N^~ then m is a minimum variance map. The other choices 
can sacrifice the optimality but being computationally faster. In particular 
Tegmark (1997) proposed to choose as M only a circulant part of the Nt 
matrix. In the following we discuss a number of approaches and their appli- 
cations to a realistic maxima- i- like data set. 

3.1 Minimum variance variants 

Exact implementation. The implementation of the exact minimum vari- 
ance estimator of the map seems a daunting task. The time stream length 
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(nt) m ay easily reach many millions of the time samples making an inver- 
sion of the time domain noise correlation matrix - an "nj-cubed" process - 
prohibitive. However, for a maxima-I like experiment with the time stream 
chunks of the length reaching "only" up to 2.5 x 10 5 time samples we found 
that the implementation is feasible even on a moderately large workstation. 

At the heart of that implementation lies a simple observation that the 
time domain correlation matrix of the stationary and continues time stream 
is Toeplitz e.g. N t (i,j) = Nt(i— j, 0). The inversion of the the Toeplitz 
matrix can be performed in as few as n 2 operations. A clearly feasible task 
for ~ 10 time samples. (An even faster algorithm exists (e.g. Golub & van 
Loan 1983) bringing that number down to ~ n t log 2 n t .) An extra gain in a 
number of operations can be achieved if the noise-correlation length is shorter 
than the time stream length and the noise-correlation matrix band-diagonal. 

The actual time streams are commonly strewn with the gaps and a Toeplitz 
character of the correlation matrix lost. However the gap filling algorithm, 
presented earlier, reconstructs the continuity of the time stream and its sta- 
tionarity. To minimize the entirely spurious content of the gaps being added 
to the map all the time samples from the gaps are directed to an extra ficti- 
tious pixel ("a gap pixel") which is subsequently rejected (marginalized over) 
from the map and the pixel domain noise-correlation matrix. 

The computational effort for the case when D = 1 scale with the number 
of pixels n p i X and the number of time samples nt as: 

• noise inverse in time domain: N t — ► N^ 1 : oc n 2 ; 

• noise inverse in pixel domain: N^ 1 , A — ► AN^ 1 A T : oc n 2 ; 

• noise weighted map: Nf , A, d — ► AN 1 T l d : oc n 2 ; 

• noise matrix in pixel domain: N p — ► N^ 1 : oc n pix ; 

• final map: N~ x , AN 1 T l d — ► N^AN^d oc n 2 pix . 

In the case of the first three items from the list the substantial savings can 
be made if the fact that the inverse noise-correlation matrix is assumed to 
be band-diagonal - an approximation usually well fulfilled for an inverse of 
a band-diagonal noise-correlation matrix. If the noise-correlation matrix is 
sparse then the additional savings can also be made. 

Memory-wise clearly there is only need to keep a first row of the N t 
matrix and in a typical situation the major requirement would be then set 
by the size of the ANf 1 matrix which is oc n P i X nt and the size of the noise 
correlation for the entire map oc n pix . Note that the first of these limits can 
be alleviated at the expense of the operation counts (there is no need to keep 
all n P i X rit matrix at the same time but then some of the parts may have to 
be recomputed a multiple number of times). 

Approximate approach. The way to speed up map making but still aiming 
at the optimal minimum variance map was proposed by Ferreira & Jaffe 
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(2000). The approximation those authors favor assumes that, 



f N^(i,j) if \i-j\ <min(n c /2,y 
[ 0, otherwise. 



(4) 



Here n c and l c are the time chunk and correlation lengths respectively, and a 
subscript C denotes a circulant part of the noise correlation matrix defined 
as N c (i, j) = N c (i- j, 0) = N c (n c - i, 0) for < i < n c /2. 

This approach is designed to perform the N t inversion with the "FFT" 
speed providing at the same time a good approximation to the exact solution 
of the previous section. By construction, however, it can be only exact in the 
absence of the noise correlations. 

The operation count changes only for two, but the most time consuming, 
items from the list in the previous paragraph which now read, 

• noise inverse in time domain: N t — ► Nj~ : cx n t logn*. 

• noise weighted map: , A, t — ► ANf t : oc n t \ogn t ; 

The scaling of the noise weighted map computation is given assuming using 
FFT and a Toeplitz matrix property (Golub & van Loan 1983). 

The required memory is set by the size of the noise matrix in pixel domain 
oc n 2 pix . It is apparent that if the efficient fast ("n t log 2 n t ") implementation 
of the Toeplitz matrix inversion is available then the major computational 
advantage of the approximate method over the exact one would vanish and 
only the memory requirement would remain as its main asset. 

This approximation is implemented in the very successful, publicly avail- 
able package called MADCAP by Julian Borrill (e.g. Borrill et al. 2000). 

3.2 Circulant variants 

In this case M = Nq^ and the noise-correlation matrix in pixel domain can 
be expressed as follows (Tegmark 1997), 



+ (ADN^D T A T ) 1 (ADN c lN st N^D T A T ) (ADN^D T A T ) 1 , 



where we decomposed explicitly a noise-correlation matrix in time domain 
into its circulant (Net) and sparse parts (N$t), N t = Net + N$t- The sparse 
part of the matrix corrects for the presence of "corners" of the circulant ma- 
trix and the gaps in the time stream. As advocated by Tegmark (1997) such a 
matrix should be really sparse and a number of the required operation greatly 
decreased over the exact approach described above. In the implementation of 
that author the D filters are additionally used to enhance sparseness. 

The price to pay for it is not only a non-optimal map as a result (the loss 
of the precision is hoped not to be substantial here), but also definitely more 



N p ee (ADN c lD T A T ) 1 + 



(5) 
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complicated algebra to be implemented especially in a presence of large num- 
ber of gaps in the time stream. If no assumption about the band-diagonality 
is made then the computational costs scale as follows: 

• noise inverse in time domain: N t — ► N^ 1 : oc n t logn t ; 

• noise inverse in pixel domain: 

circulant part: Nf x ,A — > ANf 1 A T : oc n t 2 ; 
sparse part: N st , M,A — ► AMN st MA T : 

• noise weighted map: Nf x ,A, d — > AN^ 1 d : oc n t logn t ; 

• noise matrix in pixel domain: N p — ► N^ 1 : oc v? vix . 

We have omitted the filter matrix D in above expressions assuming that 
both M and D are circulant and therefore their product can be computed 
with the "FFT speed". The effort is dominated usually by the sparse part 
computation (n t > n p i X ) though the above scaling includes only the fastest 
growing term which can not be suppressed if only the band diagonality of the 
N t matrix is assumed. Memory required for performing that task is oc n P i X n t . 

Clearly if no more approximations are made this approach is far from 
competitive with any of those discussed above. One possible way of improving 
on that is to recognize the fact that if Net is band-diagonal (plus of course 
non-zero corners to fulfill the circulancy criterion) then also N^, D T N^l 
and D t NqID are usually approximately band-diagonal. 

Another possible approximation, referred to hereafter, is to neglect com- 
pletely the sparse correction in the expression for the noise-correlation matrix 
in the pixel domain. Clearly the operational cost goes down to n 2 or r? pix - 
whichever larger - and is usually oc n pix especially if a band diagonal char- 
acter of the inverse of N^ t is assumed. In such a case the method achieves 
the performance comparable with that of Ferreira & Jaffe (2000) . 

3.3 Comparison and assessment 

In the paper by Stompor et al. (2001) we show that though none of discussed 
above approximations reproduces perfectly the map and the noise-correlation 
matrix in the pixel domain (N p ), it preserves the statistical information con- 
tained in the data - e.g. 2-point correlation properties - correctly. In fact 
we find that both the approximations, as described above, fare very well in 
recovering the anisotropy power spectrum with no apparent systematic bias. 

Clearly the performance of these approximations needs to be further 
tested if other statistics are to be applied to the maps. 

Out of the two exact methods the minimum variance implementation de- 
scribed here performs generally much better. The very fact that the exact 
maximum-likelihood method can be applied to the data sets of the MAXIMA- 
I size is of importance for the further development, tests and validation of the 
approximate algorithms. The simple operation counts and memory require- 
ments are clearly great assets of those methods and major stumbling blocks 
for the exact approach to go far beyond the size of the current data sets. 
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It is worth noticing that the gaps filling procedure described in the be- 
ginning is important for all the described methods: in a case of the exact 
methods (optimal and circulant) it facilitates the practical feasibility of the 
implementation. For the approximate methods it improves the accuracy and 
robustness of the involved approximations. 

4 Summary 

We briefly have discussed here the making of the CMB temperature anisotropy 
maps from the time-ordered data. We have argued that for the data set with 
up to a few millions of time samples producing the maps of up to a few 
thousands of the pixels is readily possible without need for sacrificing the op- 
timal character of the final map. Moreover accurate and efficient approximate 
methods also exist. Those have been tested through the direct comparison 
with the exact method on the realistic simulations and the MAXIMA-I data 
set and can become a starting point for developing algorithms capable of 
coping with still bigger and more complicated data sets. It seems likely that 
the hard problem of this kind of data analysis will be accounting on all kinds 
of imperfectness of a real data set in the statistically sound way and without 
the substantial increase of the operational count, rather than solely a sheer 
size of a data set itself. As an example here we have demonstrated how to 
incorporate the presence of short discontinuities in the time-ordered data. 
The other common problems are to be discussed in Stompor et al. (2001). 

RS acknowledges support of NASA Grant NAG5-3941 and help of Polish 
State Committee for Scientific Research grant no. 2P03D01719. AHJ and 
JHPW acknowledge support from NASA LTSA Grant no. NAG5-6552 and 
NSF KDI Grant no. 9872979. PGF acknowledges support from the RS. BR 
and CDW acknowledge support from NASA GSRP Grants no. S00-GSRP- 
032 and S00-GSRP-031. 

References 

1. Hanany, S., et al. , (2000), maxima-i: A Measurement of the Cosmic Microwave 
Background Anisotropy on angular scales of 10 arcminutes to 5 degrees. Astro- 
phys. J. Letters, in press 

2. Stompor, R., et al. , (2001), in preparation 

3. Borrill, J., (1999) in EC-TMR Conference Proceedings, 476, 3K Cosmology, ed. 
L. Maiani, F. Melchiorri, & N. Vittorio (Woodbury, New York: AIP), 224 

4. Borrill, J., Ferreira, P.,C, Jaffe, A.,H., & Stompor, R. (2000), CMB Data Anal- 
ysis with MADCAP, this volume 

5. Hoffman, Y. & Ribak, E., (1991), Astroph. J. Letters, 380, 5 

6. Tegmark, M., (1997), CMB mapping experiments, Phys. Rev. D, 56, 4516 

7. Golub, H., & van Loan, F., (1983), Matrix operations, The Johns Hopkins Uni- 
versity Press 

8. Ferreira, P., G, & Jaffe, A.,H., (2000), Simultaneous estimation of noise and 
signal in cosmic microwave background experiments, M.N.R.A.S., 312, 89 



